Overview

Dataset statistics

Number of variables: 20
Number of observations: 999
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 720.8 KiB
Average record size in memory: 738.9 B

Variable types

Text: 8
Numeric: 6
Categorical: 2
Boolean: 4

Alerts

has_missing_fields has constant value "False" (Constant)
has_transcript_issues has constant value "False" (Constant)
has_summary_issues has constant value "False" (Constant)
compression_ratio is highly correlated overall with summary_length and 1 other field (High correlation)
summary_length is highly correlated overall with compression_ratio and 1 other field (High correlation)
transcript_length is highly correlated overall with word_count_transcript (High correlation)
word_count_summary is highly correlated overall with compression_ratio and 1 other field (High correlation)
word_count_transcript is highly correlated overall with transcript_length (High correlation)
priority is highly imbalanced (75.6%) (Imbalance)
line_number is uniformly distributed (Uniform)
record_id has unique values (Unique)
line_number has unique values (Unique)

Reproduction

Analysis started: 2025-10-01 10:16:24.280579
Analysis finished: 2025-10-01 10:16:25.727778
Duration: 1.45 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json
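
The reported duration can be cross-checked from the two timestamps in this block. The sketch below is only an arithmetic check using the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2025-10-01 10:16:24.280579")
finished = datetime.fromisoformat("2025-10-01 10:16:25.727778")

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the duration shown in the report.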

Variables

record_id
Text

Unique 

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 55.6 KiB

Length

Max length: 9
Median length: 8
Mean length: 7.8928929
Min length: 6

Characters and Unicode

Total characters: 7885
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 999
Unique (%): 100.0%

Sample

1st row: line_1
2nd row: line_2
3rd row: line_3
4th row: line_4
5th row: line_5

Value | Count | Frequency (%)
line_8 | 1 | 0.1%
line_1000 | 1 | 0.1%
line_1 | 1 | 0.1%
line_2 | 1 | 0.1%
line_3 | 1 | 0.1%
line_4 | 1 | 0.1%
line_5 | 1 | 0.1%
line_985 | 1 | 0.1%
line_986 | 1 | 0.1%
line_987 | 1 | 0.1%
Other values (989) | 989 | 99.0%

Most occurring characters

Value | Count | Frequency (%)
l | 999 | 12.7%
i | 999 | 12.7%
n | 999 | 12.7%
e | 999 | 12.7%
_ | 999 | 12.7%
1 | 300 | 3.8%
3 | 300 | 3.8%
7 | 300 | 3.8%
4 | 300 | 3.8%
5 | 300 | 3.8%
Other values (5) | 1390 | 17.6%
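
The length and character figures for record_id can be reproduced under one assumption that the report does not state outright: that the IDs are "line_<n>" for n from 1 to 1000 with a single number absent. The line_number sum of 500281, against 500500 for the full range 1 to 1000, suggests the absent number is 219. The sketch below rebuilds the IDs on that assumption and recomputes the statistics; the names are illustrative.

```python
from collections import Counter
from statistics import median

# Assumption (not stated in the report): IDs are "line_<n>" for n in 1..1000,
# with n = 219 absent (inferred from the line_number sum of 500281).
MISSING_LINE = 219
record_ids = [f"line_{n}" for n in range(1, 1001) if n != MISSING_LINE]

lengths = [len(rid) for rid in record_ids]
char_counts = Counter("".join(record_ids))

total_chars = sum(lengths)                    # "Total characters"
distinct_chars = len(char_counts)             # "Distinct characters"
mean_length = total_chars / len(record_ids)   # "Mean length"

print(len(record_ids), total_chars, distinct_chars)
print(min(lengths), median(lengths), max(lengths), round(mean_length, 7))
print(char_counts.most_common(10))
```

Under this assumption the totals agree with the report: 999 IDs, 7885 characters, 15 distinct characters, and 300 occurrences of the digit 1.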

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 7885 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

line_number
Real number (ℝ)

Uniform  Unique 

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 500.78178
Minimum: 1
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 50.9
Q1: 251.5
median: 501
Q3: 750.5
95-th percentile: 950.1
Maximum: 1000
Range: 999
Interquartile range (IQR): 499
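
The quantile values for line_number are consistent with linear interpolation between order statistics. The sketch below is a cross-check, not the profiler's own code: it assumes line_number holds the integers 1 to 1000 with 219 absent (inferred from the reported sum of 500281), and it uses the "inclusive" method of Python's statistics.quantiles, which interpolates at position (n - 1) * p.

```python
from statistics import quantiles

# Assumption: line_number is 1..1000 with 219 absent (999 values, sum 500281).
values = [n for n in range(1, 1001) if n != 219]

# Quartiles via linear interpolation between neighbouring order statistics.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

# Cutting into 20 groups gives the 5th and 95th percentiles as the
# first and last cut points.
twentieths = quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

iqr = q3 - q1
value_range = max(values) - min(values)
print(p5, q1, q2, q3, p95, value_range, iqr)
```

With these inputs the quartiles, percentiles, range and IQR agree with the figures listed above.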

Descriptive statistics

Standard deviation: 288.82654
Coefficient of variation (CV): 0.57675129
Kurtosis: -1.1992711
Mean: 500.78178
Median Absolute Deviation (MAD): 250
Skewness: -0.0020031687
Sum: 500281
Variance: 83420.77
Monotonicity: Strictly increasing
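
Several of these descriptive statistics can be recomputed by hand. The sketch below again assumes line_number is the integers 1 to 1000 with 219 absent, which is an inference from the reported sum and not something the report states. It uses the sample variance (n - 1 denominator), which is the convention that reproduces the reported figure.

```python
from statistics import mean, median, stdev, variance

# Assumption: line_number is 1..1000 with 219 absent (999 values).
values = [n for n in range(1, 1001) if n != 219]

total = sum(values)
avg = mean(values)
var = variance(values)   # sample variance, divides by n - 1
sd = stdev(values)       # square root of the sample variance
cv = sd / avg            # coefficient of variation

# Median absolute deviation: median distance from the median.
med = median(values)
mad = median(abs(v - med) for v in values)

print(total, round(avg, 5), round(var, 2), round(sd, 5), round(cv, 8), mad)
```

The sum, mean, variance, standard deviation, CV and MAD obtained this way agree with the report to the precision shown.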
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1000 | 1 | 0.1%
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
984 | 1 | 0.1%
983 | 1 | 0.1%
982 | 1 | 0.1%
Other values (989) | 989 | 99.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1000 | 1 | 0.1%
999 | 1 | 0.1%
998 | 1 | 0.1%
997 | 1 | 0.1%
996 | 1 | 0.1%
995 | 1 | 0.1%
994 | 1 | 0.1%
993 | 1 | 0.1%
992 | 1 | 0.1%
991 | 1 | 0.1%

transcript_length
Real number (ℝ)

High correlation 

Distinct: 243
Distinct (%): 24.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 631.57357
Minimum: 462
Maximum: 953
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 462
5-th percentile: 544
Q1: 591
median: 627
Q3: 664.5
95-th percentile: 730
Maximum: 953
Range: 491
Interquartile range (IQR): 73.5

Descriptive statistics

Standard deviation: 59.036998
Coefficient of variation (CV): 0.093476041
Kurtosis: 2.2069959
Mean: 631.57357
Median Absolute Deviation (MAD): 37
Skewness: 0.8240949
Sum: 630942
Variance: 3485.3671
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
623 | 14 | 1.4%
663 | 11 | 1.1%
577 | 11 | 1.1%
610 | 11 | 1.1%
585 | 11 | 1.1%
613 | 10 | 1.0%
631 | 10 | 1.0%
639 | 10 | 1.0%
604 | 10 | 1.0%
644 | 10 | 1.0%
Other values (233) | 891 | 89.2%

Minimum 10 values

Value | Count | Frequency (%)
462 | 1 | 0.1%
484 | 1 | 0.1%
487 | 1 | 0.1%
490 | 1 | 0.1%
507 | 1 | 0.1%
508 | 1 | 0.1%
513 | 1 | 0.1%
517 | 1 | 0.1%
518 | 1 | 0.1%
523 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
953 | 1 | 0.1%
938 | 1 | 0.1%
919 | 1 | 0.1%
845 | 1 | 0.1%
837 | 1 | 0.1%
825 | 2 | 0.2%
815 | 3 | 0.3%
814 | 1 | 0.1%
811 | 1 | 0.1%
804 | 1 | 0.1%

summary_length
Real number (ℝ)

High correlation 

Distinct: 234
Distinct (%): 23.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 251.2953
Minimum: 78
Maximum: 478
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 78
5-th percentile: 156
Q1: 218
median: 255
Q3: 285
95-th percentile: 338.1
Maximum: 478
Range: 400
Interquartile range (IQR): 67

Descriptive statistics

Standard deviation: 53.628761
Coefficient of variation (CV): 0.21340933
Kurtosis: 0.41930783
Mean: 251.2953
Median Absolute Deviation (MAD): 32
Skewness: -0.082231057
Sum: 251044
Variance: 2876.044
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
272 | 14 | 1.4%
282 | 13 | 1.3%
251 | 13 | 1.3%
264 | 13 | 1.3%
239 | 12 | 1.2%
287 | 12 | 1.2%
261 | 11 | 1.1%
270 | 11 | 1.1%
257 | 11 | 1.1%
254 | 11 | 1.1%
Other values (224) | 878 | 87.9%

Minimum 10 values

Value | Count | Frequency (%)
78 | 1 | 0.1%
82 | 2 | 0.2%
97 | 1 | 0.1%
102 | 1 | 0.1%
110 | 1 | 0.1%
118 | 1 | 0.1%
122 | 1 | 0.1%
123 | 1 | 0.1%
128 | 2 | 0.2%
130 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
478 | 1 | 0.1%
447 | 1 | 0.1%
415 | 1 | 0.1%
408 | 1 | 0.1%
396 | 1 | 0.1%
391 | 1 | 0.1%
384 | 1 | 0.1%
376 | 1 | 0.1%
373 | 2 | 0.2%
372 | 1 | 0.1%

compression_ratio
Real number (ℝ)

High correlation 

Distinct: 975
Distinct (%): 97.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.39999268
Minimum: 0.12206573
Maximum: 0.80067002
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.12206573
5-th percentile: 0.25258602
Q1: 0.34228299
median: 0.40536278
Q3: 0.45650519
95-th percentile: 0.53403136
Maximum: 0.80067002
Range: 0.67860429
Interquartile range (IQR): 0.1142222

Descriptive statistics

Standard deviation: 0.087763369
Coefficient of variation (CV): 0.21941244
Kurtosis: 0.45074473
Mean: 0.39999268
Median Absolute Deviation (MAD): 0.056856375
Skewness: 0.028797983
Sum: 399.59268
Variance: 0.0077024089
Monotonicity: Not monotonic
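
The report does not define compression_ratio. One reading that fits both the alert linking it to summary_length and the reported extremes is summary_length divided by transcript_length. The sketch below illustrates that assumed definition only; the function name and the two length pairs are hypothetical, chosen so that the ratios match the reported minimum and maximum.

```python
def compression_ratio(summary_length: int, transcript_length: int) -> float:
    """Summary length as a fraction of transcript length (assumed definition)."""
    if transcript_length <= 0:
        raise ValueError("transcript_length must be positive")
    return summary_length / transcript_length


# Hypothetical length pairs: 78 and 478 are the reported minimum and maximum
# summary lengths; the transcript lengths are picked for illustration.
lowest = compression_ratio(78, 639)
highest = compression_ratio(478, 597)
print(round(lowest, 8), round(highest, 8))
```

That both extremes can be reproduced from plausible length pairs supports the assumed definition but does not confirm it.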
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.3333333333 | 4 | 0.4%
0.3983606557 | 3 | 0.3%
0.375 | 3 | 0.3%
0.3563402889 | 2 | 0.2%
0.2829912023 | 2 | 0.2%
0.2942097027 | 2 | 0.2%
0.3987034036 | 2 | 0.2%
0.2750424448 | 2 | 0.2%
0.4615384615 | 2 | 0.2%
0.4534050179 | 2 | 0.2%
Other values (965) | 975 | 97.6%

Minimum 10 values

Value | Count | Frequency (%)
0.1220657277 | 1 | 0.1%
0.1291338583 | 1 | 0.1%
0.1350906096 | 1 | 0.1%
0.1556982343 | 1 | 0.1%
0.1634615385 | 1 | 0.1%
0.1777059774 | 1 | 0.1%
0.1853785901 | 1 | 0.1%
0.1909620991 | 1 | 0.1%
0.1979320532 | 1 | 0.1%
0.2002820874 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
0.8006700168 | 1 | 0.1%
0.7267657993 | 1 | 0.1%
0.7017268446 | 1 | 0.1%
0.6576862124 | 1 | 0.1%
0.6570048309 | 1 | 0.1%
0.6269430052 | 1 | 0.1%
0.6205673759 | 1 | 0.1%
0.6119133574 | 1 | 0.1%
0.6097560976 | 1 | 0.1%
0.6053962901 | 1 | 0.1%

word_count_transcript
Real number (ℝ)

High correlation 

Distinct: 64
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.54655
Minimum: 83
Maximum: 166
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 83
5-th percentile: 97
Q1: 105
median: 112
Q3: 118
95-th percentile: 130
Maximum: 166
Range: 83
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 10.482622
Coefficient of variation (CV): 0.093140322
Kurtosis: 1.9626728
Mean: 112.54655
Median Absolute Deviation (MAD): 7
Skewness: 0.73697253
Sum: 112434
Variance: 109.88536
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
113 | 45 | 4.5%
115 | 45 | 4.5%
114 | 42 | 4.2%
111 | 42 | 4.2%
106 | 40 | 4.0%
118 | 39 | 3.9%
108 | 38 | 3.8%
112 | 37 | 3.7%
116 | 37 | 3.7%
107 | 36 | 3.6%
Other values (54) | 598 | 59.9%

Minimum 10 values

Value | Count | Frequency (%)
83 | 2 | 0.2%
85 | 1 | 0.1%
88 | 1 | 0.1%
89 | 1 | 0.1%
90 | 2 | 0.2%
91 | 4 | 0.4%
92 | 1 | 0.1%
93 | 7 | 0.7%
94 | 2 | 0.2%
95 | 8 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
166 | 1 | 0.1%
162 | 2 | 0.2%
155 | 1 | 0.1%
150 | 1 | 0.1%
149 | 1 | 0.1%
148 | 2 | 0.2%
146 | 2 | 0.2%
145 | 1 | 0.1%
144 | 1 | 0.1%
142 | 1 | 0.1%

word_count_summary
Real number (ℝ)

High correlation 

Distinct: 56
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.431431
Minimum: 13
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 13
5-th percentile: 25
Q1: 35
median: 41
Q3: 46
95-th percentile: 56
Maximum: 75
Range: 62
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 9.1595222
Coefficient of variation (CV): 0.22654459
Kurtosis: 0.19744495
Mean: 40.431431
Median Absolute Deviation (MAD): 6
Skewness: -0.028034678
Sum: 40391
Variance: 83.896847
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
41 | 56 | 5.6%
43 | 51 | 5.1%
44 | 50 | 5.0%
40 | 50 | 5.0%
39 | 45 | 4.5%
42 | 44 | 4.4%
45 | 43 | 4.3%
38 | 38 | 3.8%
37 | 38 | 3.8%
46 | 37 | 3.7%
Other values (46) | 547 | 54.8%

Minimum 10 values

Value | Count | Frequency (%)
13 | 2 | 0.2%
14 | 1 | 0.1%
16 | 1 | 0.1%
17 | 2 | 0.2%
18 | 3 | 0.3%
19 | 1 | 0.1%
20 | 2 | 0.2%
21 | 5 | 0.5%
22 | 10 | 1.0%
23 | 9 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
75 | 1 | 0.1%
71 | 1 | 0.1%
70 | 1 | 0.1%
68 | 1 | 0.1%
67 | 1 | 0.1%
64 | 1 | 0.1%
63 | 1 | 0.1%
62 | 2 | 0.2%
61 | 3 | 0.3%
60 | 7 | 0.7%

name
Text

Distinct: 65
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 52.8 KiB

Length

Max length: 19
Median length: 18
Mean length: 4.988989
Min length: 3

Characters and Unicode

Total characters: 4984
Distinct characters: 41
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 3.0%

Sample

1st row: Ahmed
2nd row: Daniel
3rd row: Daniel
4th row: Jane
5th row: John

Value | Count | Frequency (%)
john | 178 | 17.7%
james | 136 | 13.5%
joseph | 109 | 10.8%
sam | 80 | 7.9%
simon | 68 | 6.7%
samuel | 59 | 5.9%
david | 57 | 5.7%
jacob | 30 | 3.0%
peter | 26 | 2.6%
grace | 22 | 2.2%
Other values (54) | 243 | 24.1%

Most occurring characters

Value | Count | Frequency (%)
a | 588 | 11.8%
J | 506 | 10.2%
o | 495 | 9.9%
e | 477 | 9.6%
m | 460 | 9.2%
n | 361 | 7.2%
s | 347 | 7.0%
h | 338 | 6.8%
S | 229 | 4.6%
i | 177 | 3.6%
Other values (31) | 1006 | 20.2%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 4984 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

Distinct: 81
Distinct (%): 8.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.5 KiB

Length

Max length: 36
Median length: 22
Mean length: 7.7897898
Min length: 3

Characters and Unicode

Total characters: 7782
Distinct characters: 42
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 42
Unique (%): 4.2%

Sample

1st row: Mombasa
2nd row: Kisumu
3rd row: Mombasa
4th row: Kisumu
5th row: Narok

Value | Count | Frequency (%)
mombasa | 293 | 24.7%
mwanza | 266 | 22.4%
kisumu | 165 | 13.9%
kenya | 88 | 7.4%
tanzania | 45 | 3.8%
eldoret | 38 | 3.2%
kampala | 34 | 2.9%
nairobi | 26 | 2.2%
mbeya | 25 | 2.1%
tororo | 20 | 1.7%
Other values (60) | 188 | 15.8%

Most occurring characters

Value | Count | Frequency (%)
a | 1679 | 21.6%
M | 629 | 8.1%
s | 508 | 6.5%
m | 506 | 6.5%
n | 477 | 6.1%
o | 463 | 5.9%
u | 429 | 5.5%
b | 372 | 4.8%
i | 361 | 4.6%
z | 315 | 4.0%
Other values (32) | 2043 | 26.3%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 7782 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

issue
Categorical

Distinct: 49
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 61.6 KiB

Child labor: 315
Child Labor: 261
Forced child marriage: 90
Emotional abuse: 85
Child Marriage: 48
Other values (44): 200

Length

Max length: 51
Median length: 11
Mean length: 14.058058
Min length: 7

Characters and Unicode

Total characters: 14044
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 2.6%

Sample

1st row: Child Labor
2nd row: Forced child labor
3rd row: Child labor
4th row: Child labor
5th row: Child Labor

Common Values

Value | Count | Frequency (%)
Child labor | 315 | 31.5%
Child Labor | 261 | 26.1%
Forced child marriage | 90 | 9.0%
Emotional abuse | 85 | 8.5%
Child Marriage | 48 | 4.8%
Forced child labor | 43 | 4.3%
Child marriage | 32 | 3.2%
Forced Child Marriage | 24 | 2.4%
Neglect | 19 | 1.9%
Forced Child Labor | 15 | 1.5%
Other values (39) | 67 | 6.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
child | 861 | 38.0%
labor | 642 | 28.3%
marriage | 220 | 9.7%
forced | 189 | 8.3%
abuse | 127 | 5.6%
emotional | 116 | 5.1%
neglect | 30 | 1.3%
and | 21 | 0.9%
physical | 15 | 0.7%
of | 10 | 0.4%
Other values (13) | 37 | 1.6%

Most occurring characters

Value | Count | Frequency (%)
l | 1407 | 10.0%
a | 1378 | 9.8%
r | 1275 | 9.1%
(space) | 1269 | 9.0%
i | 1240 | 8.8%
o | 1098 | 7.8%
d | 1083 | 7.7%
h | 878 | 6.3%
b | 771 | 5.5%
C | 721 | 5.1%
Other values (25) | 2924 | 20.8%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 14044 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

Distinct: 102
Distinct (%): 10.2%
Missing: 0
Missing (%): 0.0%
Memory size: 64.2 KiB

Length

Max length: 47
Median length: 45
Mean length: 16.707708
Min length: 5

Characters and Unicode

Total characters: 16691
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 4.3%

Sample

1st row: Child Exploitation
2nd row: Child Labor
3rd row: Labor exploitation
4th row: Child exploitation
5th row: Work Exploitation

Value | Count | Frequency (%)
child | 524 | 24.5%
labor | 441 | 20.6%
exploitation | 386 | 18.0%
protection | 191 | 8.9%
abuse | 106 | 5.0%
marriage | 102 | 4.8%
forced | 81 | 3.8%
emotional | 53 | 2.5%
workplace | 40 | 1.9%
violation | 29 | 1.4%
Other values (31) | 186 | 8.7%

Most occurring characters

Value | Count | Frequency (%)
o | 1944 | 11.6%
i | 1825 | 10.9%
a | 1314 | 7.9%
t | 1300 | 7.8%
l | 1180 | 7.1%
(space) | 1140 | 6.8%
r | 1042 | 6.2%
e | 827 | 5.0%
n | 695 | 4.2%
d | 622 | 3.7%
Other values (35) | 4802 | 28.8%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 16691 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

priority
Categorical

Imbalance 

Distinct: 49
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 54.3 KiB

High: 802
Urgent: 105
High urgency: 12
High (urgent action required): 10
Medium: 7
Other values (44): 63

Length

Max length: 107
Median length: 4
Mean length: 6.4934935
Min length: 4

Characters and Unicode

Total characters: 6487
Distinct characters: 42
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 3.5%

Sample

1st row: Urgent
2nd row: High
3rd row: High
4th row: High
5th row: High

Common Values

Value | Count | Frequency (%)
High | 802 | 80.3%
Urgent | 105 | 10.5%
High urgency | 12 | 1.2%
High (urgent action required) | 10 | 1.0%
Medium | 7 | 0.7%
Moderate | 5 | 0.5%
High Priority | 5 | 0.5%
High (Urgent action required) | 5 | 0.5%
High (Immediate Action Required) | 3 | 0.3%
High (Urgent) | 2 | 0.2%
Other values (39) | 43 | 4.3%
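
The report does not say how the 75.6% imbalance figure for priority is derived. One formula that reproduces it from the counts above is one minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. The sketch below is a cross-check under that assumption, not the profiler's code. The 39 "other" counts are inferred: 35 values occur once (Unique: 35), which leaves 4 values sharing the remaining 8 rows, so 2 rows each. Returning 0.0 for fewer than two classes is a choice made here for illustration.

```python
from math import log2

# Counts from the Common Values table, plus the inferred "other" counts:
# 4 values occurring twice and 35 values occurring once.
counts = [802, 105, 12, 10, 7, 5, 5, 5, 3, 2] + [2] * 4 + [1] * 35


def imbalance_score(class_counts: list[int]) -> float:
    """1 - H / log2(k): 0 for a uniform distribution, near 1 for one dominant class."""
    k = len(class_counts)
    if k < 2:
        return 0.0  # illustrative choice: a single class has no defined imbalance
    total = sum(class_counts)
    entropy = -sum((c / total) * log2(c / total) for c in class_counts if c > 0)
    return 1 - entropy / log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

With 49 classes over 999 rows this gives roughly 0.756, in line with the alert.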

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
high | 872 | 67.3%
urgent | 136 | 10.5%
action | 31 | 2.4%
required | 27 | 2.1%
urgency | 18 | 1.4%
immediate | 15 | 1.2%
the | 12 | 0.9%
needed | 11 | 0.8%
moderate | 10 | 0.8%
to | 10 | 0.8%
Other values (63) | 153 | 11.8%

Most occurring characters

Value | Count | Frequency (%)
i | 1055 | 16.3%
g | 1050 | 16.2%
h | 915 | 14.1%
H | 870 | 13.4%
e | 391 | 6.0%
t | 297 | 4.6%
(space) | 296 | 4.6%
n | 272 | 4.2%
r | 261 | 4.0%
U | 119 | 1.8%
Other values (32) | 961 | 14.8%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 6487 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

has_missing_fields
Boolean

Constant 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count | Frequency (%)
False | 999 | 100.0%

has_transcript_issues
Boolean

Constant 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count | Frequency (%)
False | 999 | 100.0%

has_summary_issues
Boolean

Constant 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count | Frequency (%)
False | 999 | 100.0%

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count | Frequency (%)
True | 611 | 61.2%
False | 388 | 38.8%

Distinct: 269
Distinct (%): 26.9%
Missing: 0
Missing (%): 0.0%
Memory size: 68.5 KiB

Length

Max length: 88
Median length: 70
Mean length: 21.062062
Min length: 5

Characters and Unicode

Total characters: 21041
Distinct characters: 54
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 190
Unique (%): 19.0%

Sample

1st row: 5-year-old girl
2nd row: 13-year-old sister
3rd row: Sammy (14 years old)
4th row: 12-year-old brother
5th row: 13-year-old sister

Value | Count | Frequency (%)
sister | 403 | 14.7%
12-year-old | 367 | 13.4%
girl | 208 | 7.6%
niece | 165 | 6.0%
13-year-old | 142 | 5.2%
5-year-old | 124 | 4.5%
14-year-old | 113 | 4.1%
7-year-old | 101 | 3.7%
daughter | 94 | 3.4%
a | 90 | 3.3%
Other values (185) | 927 | 33.9%

Most occurring characters

Value | Count | Frequency (%)
e | 2190 | 10.4%
r | 2015 | 9.6%
- | 1849 | 8.8%
(space) | 1735 | 8.2%
o | 1405 | 6.7%
l | 1286 | 6.1%
a | 1246 | 5.9%
d | 1196 | 5.7%
s | 1103 | 5.2%
i | 1059 | 5.0%
Other values (44) | 5957 | 28.3%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 21041 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

Distinct: 442
Distinct (%): 44.2%
Missing: 0
Missing (%): 0.0%
Memory size: 69.0 KiB

Length

Max length: 120
Median length: 69
Mean length: 21.560561
Min length: 4

Characters and Unicode

Total characters: 21539
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 360
Unique (%): 36.0%

Sample

1st row: Unknown, but a local factory is suspected
2nd row: Local factory owners
3rd row: Unknown (employer at a local factory)
4th row: Unspecified local workshop owner or workers
5th row: Unnamed factory owners

Value | Count | Frequency (%)
factory | 305 | 9.4%
the | 200 | 6.2%
local | 188 | 5.8%
unknown | 138 | 4.2%
neighbor | 118 | 3.6%
owner | 114 | 3.5%
family | 112 | 3.4%
owners | 108 | 3.3%
not | 107 | 3.3%
mother | 100 | 3.1%
Other values (251) | 1758 | 54.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2249 | 10.4%
e | 2084 | 9.7%
o | 1836 | 8.5%
r | 1512 | 7.0%
a | 1324 | 6.1%
n | 1313 | 6.1%
t | 1196 | 5.6%
i | 1065 | 4.9%
s | 865 | 4.0%
c | 814 | 3.8%
Other values (45) | 7281 | 33.8%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 21539 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

Distinct: 549
Distinct (%): 55.0%
Missing: 0
Missing (%): 0.0%
Memory size: 89.1 KiB

Length

Max length: 97
Median length: 73
Mean length: 41.943944
Min length: 10

Characters and Unicode

Total characters: 41902
Distinct characters: 59
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 440
Unique (%): 44.0%

Sample

1st row: Mombasa Child Welfare Society and the police
2nd row: ["Nairobi Children's Office, Relevant labor authorities"]
3rd row: ['Mombasa Labor Commission, Police']
4th row: Kisumu Child Protection Unit and police
5th row: ["Kisumu Children's Office", 'local police']

Value | Count | Frequency (%)
police | 909 | 16.9%
office | 796 | 14.8%
children's | 614 | 11.4%
and | 456 | 8.5%
local | 305 | 5.7%
labor | 283 | 5.2%
mombasa | 257 | 4.8%
child | 229 | 4.2%
mwanza | 222 | 4.1%
kisumu | 153 | 2.8%
Other values (127) | 1168 | 21.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 4393 | 10.5%
i | 3529 | 8.4%
e | 3250 | 7.8%
a | 2611 | 6.2%
l | 2547 | 6.1%
o | 2546 | 6.1%
c | 2366 | 5.6%
' | 1842 | 4.4%
n | 1835 | 4.4%
f | 1688 | 4.0%
Other values (49) | 15295 | 36.5%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 41902 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

Distinct: 639
Distinct (%): 64.0%
Missing: 0
Missing (%): 0.0%
Memory size: 100.6 KiB

Length

Max length: 182
Median length: 119
Mean length: 53.996997
Min length: 18

Characters and Unicode

Total characters: 53943
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 559
Unique (%): 56.0%

Sample

1st row: Immediate action required: report to child welfare society and police
2nd row: Report to Nairobi Children's Office and relevant labor authorities
3rd row: Reporting to authorities and follow-ups
4th row: Report to authorities, follow-up
5th row: Investigation, rescue, rehabilitation and reintegration of the victim, legal action against perpetrators

Value | Count | Frequency (%)
to | 856 | 10.9%
and | 731 | 9.3%
report | 701 | 9.0%
authorities | 648 | 8.3%
the | 515 | 6.6%
follow-up | 352 | 4.5%
immediate | 256 | 3.3%
up | 202 | 2.6%
follow | 198 | 2.5%
support | 188 | 2.4%
Other values (248) | 3172 | 40.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 6820 | 12.6%
t | 5392 | 10.0%
o | 5299 | 9.8%
e | 5063 | 9.4%
i | 3692 | 6.8%
r | 3380 | 6.3%
a | 3044 | 5.6%
p | 2486 | 4.6%
l | 2387 | 4.4%
n | 2163 | 4.0%
Other values (45) | 14217 | 26.4%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 53943 | 100.0%

Most frequent character per category, script and block

Every character is reported under a single (unknown) category, script and block, so each of these three tables repeats the Most occurring characters table above.

Interactions

[Figures: 36 interaction plots (Matplotlib v3.10.0)]

Correlations

variable | compression_ratio | has_consistency_issues | issue | line_number | priority | summary_length | transcript_length | word_count_summary | word_count_transcript
compression_ratio | 1.000 | 0.061 | 0.000 | -0.022 | 0.000 | 0.894 | -0.263 | 0.865 | -0.248
has_consistency_issues | 0.061 | 1.000 | 0.090 | 0.000 | 0.088 | 0.020 | 0.090 | 0.077 | 0.169
issue | 0.000 | 0.090 | 1.000 | 0.029 | 0.048 | 0.000 | 0.123 | 0.061 | 0.107
line_number | -0.022 | 0.000 | 0.029 | 1.000 | 0.000 | -0.033 | -0.022 | -0.042 | -0.027
priority | 0.000 | 0.088 | 0.048 | 0.000 | 1.000 | 0.000 | 0.093 | 0.000 | 0.057
summary_length | 0.894 | 0.020 | 0.000 | -0.033 | 0.000 | 1.000 | 0.139 | 0.965 | 0.131
transcript_length | -0.263 | 0.090 | 0.123 | -0.022 | 0.093 | 0.139 | 1.000 | 0.134 | 0.934
word_count_summary | 0.865 | 0.077 | 0.061 | -0.042 | 0.000 | 0.965 | 0.134 | 1.000 | 0.140
word_count_transcript | -0.248 | 0.169 | 0.107 | -0.027 | 0.057 | 0.131 | 0.934 | 0.140 | 1.000
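
The "High correlation" alerts in the overview can be cross-checked against this matrix by listing the variable pairs whose coefficient is large in absolute value. The sketch below uses the values from the table above. The 0.5 cut-off and the function name `high_correlation_pairs` are assumptions made for illustration, and the code does not reproduce how the profiler computes the coefficients themselves.

```python
from itertools import combinations

# Column order and coefficients copied from the correlation table above.
columns = [
    "compression_ratio", "has_consistency_issues", "issue", "line_number",
    "priority", "summary_length", "transcript_length",
    "word_count_summary", "word_count_transcript",
]
matrix = [
    [1.000, 0.061, 0.000, -0.022, 0.000, 0.894, -0.263, 0.865, -0.248],
    [0.061, 1.000, 0.090, 0.000, 0.088, 0.020, 0.090, 0.077, 0.169],
    [0.000, 0.090, 1.000, 0.029, 0.048, 0.000, 0.123, 0.061, 0.107],
    [-0.022, 0.000, 0.029, 1.000, 0.000, -0.033, -0.022, -0.042, -0.027],
    [0.000, 0.088, 0.048, 0.000, 1.000, 0.000, 0.093, 0.000, 0.057],
    [0.894, 0.020, 0.000, -0.033, 0.000, 1.000, 0.139, 0.965, 0.131],
    [-0.263, 0.090, 0.123, -0.022, 0.093, 0.139, 1.000, 0.134, 0.934],
    [0.865, 0.077, 0.061, -0.042, 0.000, 0.965, 0.134, 1.000, 0.140],
    [-0.248, 0.169, 0.107, -0.027, 0.057, 0.131, 0.934, 0.140, 1.000],
]


def high_correlation_pairs(columns, matrix, threshold=0.5):
    """List (a, b, coefficient) for pairs with |coefficient| >= threshold."""
    pairs = []
    # The matrix is symmetric, so visiting each unordered pair once is enough.
    for i, j in combinations(range(len(columns)), 2):
        value = matrix[i][j]
        if abs(value) >= threshold:
            pairs.append((columns[i], columns[j], value))
    # Strongest relationships first.
    return sorted(pairs, key=lambda pair: -abs(pair[2]))


pairs = high_correlation_pairs(columns, matrix)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

With this threshold, four pairs are reported: the two summary-size measures with each other, the two transcript-size measures with each other, and compression_ratio with each summary-size measure. That matches the variables named in the alerts.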

Missing values

Nullity by column: a simple visualization of missing values per column.
Nullity matrix: a data-dense display that lets you quickly pick out patterns in data completion.
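
The per-column nullity counts behind such a chart can be computed directly. The sketch below is a standard-library illustration under the assumption that a missing cell is represented as `None`. The function name `nullity_by_column` is hypothetical, and the two records are abbreviated from rows in the Sample section.

```python
def nullity_by_column(records):
    """Count missing (None) cells per column across a list of dict records."""
    counts = {}
    for record in records:
        for column, value in record.items():
            # A boolean adds as 0 or 1, so this increments only for None.
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


# Two abbreviated records based on the Sample section.
records = [
    {"record_id": "line_1", "name": "Ahmed", "location": "Mombasa", "priority": "Urgent"},
    {"record_id": "line_2", "name": "Daniel", "location": "Kisumu", "priority": "High"},
]

missing = nullity_by_column(records)
print(missing)
```

Every count is zero for these records, consistent with the overview's figure of 0 missing cells.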

Sample

index | record_id | line_number | transcript_length | summary_length | compression_ratio | word_count_transcript | word_count_summary | name | location | issue | category | priority | has_missing_fields | has_transcript_issues | has_summary_issues | has_consistency_issues | victim_info | perpetrator_info | referral_info | intervention_info
0 | line_1 | 1 | 663 | 180 | 0.271493 | 123 | 29 | Ahmed | Mombasa | Child Labor | Child Exploitation | Urgent | False | False | False | False | 5-year-old girl | Unknown, but a local factory is suspected | Mombasa Child Welfare Society and the police | Immediate action required: report to child welfare society and police
1 | line_2 | 2 | 612 | 265 | 0.433007 | 106 | 42 | Daniel | Kisumu | Forced child labor | Child Labor | High | False | False | False | True | 13-year-old sister | Local factory owners | ["Nairobi Children's Office, Relevant labor authorities"] | Report to Nairobi Children's Office and relevant labor authorities
2 | line_3 | 3 | 623 | 348 | 0.558587 | 120 | 59 | Daniel | Mombasa | Child labor | Labor exploitation | High | False | False | False | False | Sammy (14 years old) | Unknown (employer at a local factory) | ['Mombasa Labor Commission, Police'] | Reporting to authorities and follow-ups
3 | line_4 | 4 | 603 | 238 | 0.394693 | 108 | 40 | Jane | Kisumu | Child labor | Child exploitation | High | False | False | False | True | 12-year-old brother | Unspecified local workshop owner or workers | Kisumu Child Protection Unit and police | Report to authorities, follow-up
4 | line_5 | 5 | 685 | 176 | 0.256934 | 127 | 28 | John | Narok | Child Labor | Work Exploitation | High | False | False | False | False | 13-year-old sister | Unnamed factory owners | ["Kisumu Children's Office", 'local police'] | Investigation, rescue, rehabilitation and reintegration of the victim, legal action against perpetrators
5 | line_6 | 6 | 606 | 264 | 0.435644 | 107 | 43 | John | Mwanza | Forced child marriage | Child protection | Urgent | False | False | False | False | 12-year-old niece | Family members | ["Mwanza Children's Office", 'police'] | Report to the authorities and seek further assistance
6 | line_7 | 7 | 594 | 285 | 0.479798 | 101 | 45 | Sam | Busia | Child marriage and emotional abuse | Child marriage, Emotional abuse | High (urgent intervention needed) | False | False | False | True | 14-year-old sister | husband | Busia Children's Office, Police | Follow-up calls with the caller
7 | line_8 | 8 | 577 | 270 | 0.467938 | 107 | 44 | Sam | Mwanza | Child labor | Labor exploitation | High | False | False | False | True | 7-year-old sister | Local factory | Mwanza Child Welfare Services and police | Immediate reporting to authorities followed by follow-up
8 | line_9 | 9 | 661 | 229 | 0.346445 | 119 | 36 | James | Kisumu, Kenya | Emotional abuse | Child Protection | High | False | False | False | True | 12-year-old girl | Mother | ["Kisumu Children's Office", 'Local police'] | Report to authorities and follow-up with Childline
9 | line_10 | 10 | 614 | 282 | 0.459283 | 110 | 47 | John | Kisumu | Forced child marriage | Child protection | High | False | False | False | False | 14-year-old daughter of a neighbor | Not specified | ['Kisumu Child Protection Unit', 'local police'] | Report to authorities and follow-up by helpline

index | record_id | line_number | transcript_length | summary_length | compression_ratio | word_count_transcript | word_count_summary | name | location | issue | category | priority | has_missing_fields | has_transcript_issues | has_summary_issues | has_consistency_issues | victim_info | perpetrator_info | referral_info | intervention_info
989 | line_991 | 991 | 661 | 321 | 0.485628 | 112 | 50 | David | Mombasa | Child labor | Labor Exploitation | High | False | False | False | True | A 14-year-old girl | The victim's neighbor | The Children's Office and local police | Immediate report and action by authorities
990 | line_992 | 992 | 682 | 193 | 0.282991 | 122 | 31 | Hassan | Mwanza | Child Labor | Child Labor | High | False | False | False | True | 5-year-old sister | Unnamed family members at a local factory | Mwanza Labor Office and police | Report to authorities and offer support
991 | line_993 | 993 | 659 | 284 | 0.430956 | 116 | 42 | Sam | Kisumu | Child Labor | Labor Exploitation | High | False | False | False | True | 12-year-old sister | Local factory owners | ["Kisumu Children's Office", 'Police'] | Report the issue to the appropriate authorities
992 | line_994 | 994 | 572 | 244 | 0.426573 | 99 | 40 | James | Kisumu, Kenya | Child marriage | Child protection issues | High | False | False | False | True | Around 13-year-old girl | The child's family (not specified) | Children's Office and local police | Immediate action: Report the case to the Children's Office and local police
993 | line_995 | 995 | 601 | 234 | 0.389351 | 108 | 37 | John | Mwanza | Child labor | Child exploitation | High | False | False | False | True | 13-year-old sister | Local factory owner | ["Mwanza Children's Office", 'Labor inspectorate'] | Report to local authorities and follow-up
994 | line_996 | 996 | 644 | 343 | 0.532609 | 111 | 54 | Isaac | Mwanza, Tanzania | Child labor | Child abuse and exploitation | High | False | False | False | True | 5-year-old daughter of Isaac's sister | Isaac's sister | ['Mwanza Child Welfare Office', 'local police'] | Report to authorities, follow-up support
995 | line_997 | 997 | 610 | 240 | 0.393443 | 107 | 40 | Jamal | Kisumu | Emotional abuse | Child protection | High | False | False | False | True | 12-year-old sister | Stepmother | ["Kisumu Children's Office", 'Police'] | Report to authorities and offer support
996 | line_998 | 998 | 577 | 178 | 0.308492 | 111 | 31 | James | Nakuru | Child Labor | Labor Exploitation | High | False | False | False | False | 5-year-old sister | Local shop owner | ['Nakuru Children’s Office, Police'] | Immediate reporting to authorities and follow-up
997 | line_999 | 999 | 658 | 244 | 0.370821 | 113 | 33 | Moses | Tororo | Child Labor | Labor Exploitation | High | False | False | False | True | 5-year-old sister | Not specified | Tororo Children’s Office, police, and local authorities | Immediate reporting and ongoing support
998 | line_1000 | 1000 | 546 | 158 | 0.289377 | 100 | 25 | John | Mwanza | Child Labor | Child Protection | High | False | False | False | True | 12-year-old nephew | Factory Owner | Mwanza Labor Office and the police | Immediate reporting to authorities, follow-up, and potential rescue operation
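
The report does not state how compression_ratio is derived, but the sample rows are consistent with summary_length divided by transcript_length, which would also explain its strong correlation with summary_length. The sketch below checks that assumed formula against four rows from the sample. The function name `compression_ratio` and the tolerance are illustrative choices.

```python
import math

# (transcript_length, summary_length, reported compression_ratio)
# taken from sample rows 0, 1, 2 and 998.
sample = [
    (663, 180, 0.271493),
    (612, 265, 0.433007),
    (623, 348, 0.558587),
    (546, 158, 0.289377),
]


def compression_ratio(summary_length, transcript_length):
    """Assumed definition: summary characters per transcript character."""
    return summary_length / transcript_length


# The reported values are rounded to six decimals, so compare with a tolerance.
matches = [
    math.isclose(compression_ratio(summary, transcript), reported, abs_tol=1e-6)
    for transcript, summary, reported in sample
]
print(matches)
```

All four rows agree with the assumed formula to within rounding.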